Character Extraction from Interfering Background - Analysis of Double-Sided Handwritten Archival Documents

Authors

  • Chew Lim Tan
  • Ruini Cao
  • Qian Wang
  • Peiyi Shen
Abstract

The seeping of ink through the pages of certain double-sided handwritten documents after long periods of storage poses a serious problem for human readers and OCR systems. This paper addresses the problem by recovering the content on the front side of a page from the interfering image caused by the handwriting on the reverse side. First, by adapting the Gaussian stochastic model, an interference model based on norm-orientation-discontinuity is proposed for analyzing the properties of the interfering strokes. Second, an improved Canny edge detector with an edge norm-orientation similarity constraint is proposed. In addition, two low thresholds are used to detect edges instead of a single low threshold. This improvement can link weaker foreground edges without introducing noise in the overlapping/overshadowed area. The proposed algorithms perform well regardless of the intensity differences between the image on the front side and the interfering image from the reverse side. Segmentation results on real images are shown and evaluated.
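The edge-linking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two low thresholds interact with the orientation constraint, so the sketch assumes that a neighbour above the stricter low threshold is linked unconditionally (as in plain Canny hysteresis), while a neighbour above only the looser low threshold is linked when its gradient orientation is close to that of the edge pixel it grows from. The names `link_edges`, `low_strict`, `low_loose`, and `max_angle_diff` are hypothetical, and smoothing and non-maximum suppression are omitted.

```python
import math
from collections import deque


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)


def link_edges(mag, ang, high, low_strict, low_loose, max_angle_diff):
    """Hysteresis-style edge linking with two low thresholds (assumed scheme).

    mag, ang: 2D lists of gradient magnitude and orientation (radians).
    Pixels with mag >= high seed the edge map. An 8-neighbour is linked if
    mag >= low_strict, or if mag >= low_loose and its orientation differs
    from the current edge pixel by at most max_angle_diff.
    """
    h, w = len(mag), len(mag[0])
    edge = [[mag[y][x] >= high for x in range(w)] for y in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w) if edge[y][x])
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy == 0 and dx == 0) or not (0 <= ny < h and 0 <= nx < w):
                    continue
                if edge[ny][nx]:
                    continue
                # Plain Canny rule: strong enough to link on magnitude alone.
                strong_enough = mag[ny][nx] >= low_strict
                # Assumed extra rule: weaker pixel, but orientation agrees.
                similar = (
                    mag[ny][nx] >= low_loose
                    and angle_diff(ang[ny][nx], ang[y][x]) <= max_angle_diff
                )
                if strong_enough or similar:
                    edge[ny][nx] = True
                    queue.append((ny, nx))
    return edge
```

Under this assumption, a weak stroke that continues in the same direction as a detected foreground edge is kept, while an equally weak pixel whose orientation changes abruptly (as might happen where an interfering stroke crosses) is left out.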

Similar resources

Segmentation and Analysis of Double-Sided Handwritten Archival Documents

Historical handwritten documents are preserved in good condition in many national archives or libraries. One problem that many archivists are facing is the seeping of ink through the pages of certain double-sided handwritten documents after long periods of storage. This paper addresses this problem and develops a novel algorithm to extract clear textual images from interfering and overlapping a...

Text Extraction from Historical Handwritten Documents by Edge Detection

Many national archives or libraries keep large amounts of historical handwritten documents. One problem that many archivists are facing is the seeping of ink through the pages of certain double-sided handwritten documents after long periods of storage. The result is that the handwritten characters from the reverse side appear as noise on the front side and even interfere with the front side char...

A wavelet approach to double-sided document image pair processing

In this paper, we present a novel method for processing double-sided historic handwritten documents using wavelets. The method is specially designed to remove the interfering strokes from the reverse side due to ink seeping through pages after long periods of storage. The proposed method works by first matching both sides of a document page such that the interfering strokes are mapped with the ...

Restoration of Archival Documents Using a Wavelet Technique

This paper addresses a problem of restoring handwritten archival documents by recovering their contents from the interfering handwriting on the reverse side caused by the seeping of ink. We present a novel method that works by first matching both sides of a document such that the interfering strokes are mapped with the corresponding strokes originating from the reverse side. This facilitates th...

Characters Extraction from Strings on a Document Image Using Handwritten Marks on Touch Screen

We have argued for the validity of Tablet PCs in the field of e-learning, and this paper discusses a new system for extracting characters from strings in a document image using handwritten marks on a touch screen. As the first step of this study, we proposed a method to identify handwritten notes/marks in order to associate strings in the documents with handwritten marks. In this paper, experiments using actual scanne...

Publication date: 2001